Studio report: sound synthesis with DDSP and network bending techniques
This paper reports on our experiences synthesizing sounds and building network bending functionality onto the Differentiable Digital Signal Processing (DDSP) system. DDSP is an extension to the TensorFlow API with which we can embed trainable signal processing nodes in neural networks. Comparing DDSP sound synthesis networks to preset-finding networks and sample-level synthesis networks, we argue that it offers a third mode of working, providing continuous real-time control of high-fidelity synthesizers using low numbers of control parameters. We describe two phases of our experimentation. First, we worked with a composer to explore different training datasets and parameters. Second, we extended DDSP models with network bending functionality, which allows us to feed additional control data into the network's hidden layers and achieve new timbral effects. We describe several possible network bending techniques and how they affect the sound.
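The core idea of bending a network's hidden layers can be illustrated in miniature. The sketch below is our own toy example, not the DDSP implementation: a two-layer MLP with random weights whose hidden activations are transformed by a user-supplied function at inference time, changing the output without any retraining.

```python
import numpy as np

# Toy two-layer MLP with fixed random weights (illustrative values only).
rng = np.random.default_rng(0)
W1 = rng.normal(size=(4, 8))   # input -> hidden weights
W2 = rng.normal(size=(8, 2))   # hidden -> output weights

def forward(x, bend=lambda h: h):
    """Forward pass; `bend` is the network-bending hook applied to
    the hidden-layer activations before they reach the output layer."""
    h = np.tanh(x @ W1)        # hidden activations
    h = bend(h)                # transform inserted into the network
    return h @ W2

x = rng.normal(size=(1, 4))
plain = forward(x)                                            # unbent output
scaled = forward(x, bend=lambda h: 2.0 * h)                   # scale all activations
ablated = forward(x, bend=lambda h: h * (np.arange(8) % 2))   # zero out half the nodes
```

In a real sound synthesis network the same hook would sit inside the decoder, with the transform's parameters (scale, rotation, node-group selection) exposed as extra control inputs.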
Friend Me Your Ears: A Musical Approach to Human-Robot Relationships.
PhD
A relationship is something that is necessarily built up over time; however, Human-Robot Interaction (HRI) trials are rarely extended beyond a single session. Such studies are insufficient for examining multi-interaction scenarios, which will become commonplace if the robot is situated in a workplace or adopts a role that is part of a human's routine. Long-term studies that have been executed often demonstrate a declining novelty effect. Music, however, provides an opportunity for affective engagement, shared creativity, and social activity. That said, it is unlikely that the robot best equipped to build sustainable and meaningful relationships with humans will be one that can solely play music. In their day-to-day lives, most humans encounter machines and computer programs capable of executing impressively complex tasks to a high standard that may provide them with hours of engagement. In order to have anything that could be classed as a social relationship, the human must have the sense that their interactions are taking place with another, a phenomenon known as social presence. In this thesis, we examine whether the addition of simulated social behaviours will improve a sense of believability or social presence, which, along with an engaging musical interaction, will allow us to move towards something that could be called a human-robot relationship. First, we conducted a large online survey to gain insight into relationships based in regular musical activity. Using these results, we designed, constructed and programmed Mortimer, a robotic system capable of playing the drums, and a responsive composition algorithm to best meet these aims. The robot was then used in a series of studies, one single-session and two long-term, testing various simulated social behaviours to complement the musical improvisation. These experiments and their results address the paucity of long-term studies both specifically in Social Robotics and in the broader HRI field, and provide promising insight into a possible solution to generally poor outcomes in this area. This conclusion is based upon the model of a positive human-robot relationship and the methodological approach of automated behavioural metrics for evaluating robotic systems, both developed and detailed within the thesis. Funded by the EPSRC as part of the Media and Arts Technology Doctoral Training Centre, EP/G03723X/2.
Creating Latent Spaces for Modern Music Genre Rhythms Using Minimal Training Data
In this paper we present R-VAE, a system designed for the exploration of latent spaces of musical rhythms. Unlike most previous work in rhythm modeling, R-VAE can be trained with small datasets, enabling rapid customization and exploration by individual users. R-VAE employs a data representation that encodes simple and compound meter rhythms. To the best of our knowledge, this is the first time that a network architecture has been used to encode rhythms with these characteristics, which are common in some modern popular music genres.
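One way a single grid can encode both simple and compound subdivisions is to use 12 ticks per beat, so that sixteenth notes land on multiples of 3 and eighth-note triplets on multiples of 4. The sketch below is our own simplified illustration of that idea, not the actual R-VAE data representation.

```python
TICKS_PER_BEAT = 12  # divisible by both 4 (simple) and 3 (compound)

def onsets_to_grid(onsets_in_beats, n_beats=4):
    """Quantize onset times (in beats) to a binary tick grid that can
    hold simple- and compound-meter subdivisions simultaneously."""
    grid = [0] * (n_beats * TICKS_PER_BEAT)
    for t in onsets_in_beats:
        idx = round(t * TICKS_PER_BEAT)
        if 0 <= idx < len(grid):
            grid[idx] = 1
    return grid

straight = onsets_to_grid([0, 0.5, 1, 1.5])    # simple-meter eighth notes
shuffle = onsets_to_grid([0, 1/3, 2/3, 1])     # compound-meter triplets
```

A grid like this (per drum instrument) could then serve as the input and output vector of a variational autoencoder.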
Supporting Feature Engineering in End-User Machine Learning
A truly human-centred approach to Machine Learning (ML) must consider how to support people modelling phenomena beyond those receiving the bulk of industry and academic attention, including phenomena relevant only to niche communities and for which large datasets may never exist. While deep feature learning is often viewed as a panacea that obviates the task of feature engineering, it may be insufficient to support users with small datasets, novel data sources, and unusual learning problems. We argue that it is therefore necessary to investigate how to support users who are not ML experts in deriving suitable feature representations for new ML problems. We also report on the results of a preliminary study comparing user-driven and automated feature engineering approaches in a sensor-based gesture recognition task.
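To make the feature-engineering task concrete, here is a hedged sketch (our own choice of features, not the paper's) of hand-engineered summary statistics over a sliding window of accelerometer samples, the kind of representation a user might derive for a gesture recognizer.

```python
import numpy as np

def window_features(window):
    """Compute simple per-window features from a (samples, axes) array."""
    w = np.asarray(window, dtype=float)
    return np.concatenate([
        w.mean(axis=0),                           # average value per axis
        w.std(axis=0),                            # variability per axis
        w.max(axis=0) - w.min(axis=0),            # range of motion per axis
        np.abs(np.diff(w, axis=0)).mean(axis=0),  # mean sample-to-sample change
    ])

# Three accelerometer samples (x, y, z) from a hypothetical sensor window.
window = np.array([[0.0, 0.1, 9.8],
                   [0.2, 0.0, 9.7],
                   [0.1, 0.2, 9.9]])
feats = window_features(window)   # 4 feature types x 3 axes = 12 values
```

The resulting fixed-length vector can be fed to any standard classifier, which is exactly the step that non-expert users need support with.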
The challenge of feature engineering in programming for moving bodies
The design of bespoke human movement analysis and control systems by end users and other people without programming or signal processing expertise presents great opportunities for the arts, accessible interface design, games, and other domains. In this paper, we describe the challenge of feature engineering that confronts many people wishing to build such systems. We have conducted three studies exploring approaches to supporting feature engineering and investigated how such approaches may impact on system accuracy, user experience, and design outcomes. We briefly outline the study outcomes that are most relevant to the workshop themes.
Network Bending Neural Vocoders
Network bending [1] aims to elicit interesting creative output from generative neural networks by applying various transformations to the activations of groups of network nodes. This paper describes an investigation of how this emerging technique of "network bending" can be used to provide novel creative control over sound synthesis networks based on the Magenta DDSP API [2], and how best to provide access to the resulting sound synthesis neural networks to creative practitioners.
Examining Student Coding Behaviours in Creative Computing Lessons using Abstract Syntax Trees and Vocabulary Analysis
Creative computing is an approach to computing education which emphasises the creation of interactive audiovisual software and an art-school-influenced pedagogy. Given this emphasis on Dewey's "learning by doing", we set out to investigate the processes students use to develop their programs. We refer to these processes as the students' "coding behaviour", and we expect that understanding it will provide us with valuable information about how students learn in our creative computing classes. As existing metrics were not sufficient, we introduce a new set of quantitative metrics to describe coding behaviours. The metrics consider factors such as students' vocabulary use and development, how fast and how much they alter the functionality of code over time, and how they iterate on their code through text insert and delete operations. Many of our lessons involve providing students with demonstrator code which they use as a base for the development of their programs, so we use demo code as an entry point to our dataset. We look at programs students have written by developing the demo code, in a dataset of over 16,000 programs. We clustered the demo code using the set of descriptive metrics. This led to a set of clusters containing programs associated with distinct coding behaviours. Four was the ideal number of clusters for cluster density and separation. We found that the clusters had distinct behaviour patterns, that they were associated with different instructors, and that they contained demo programs of different lengths.
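As a small illustration of one metric family, the sketch below (our own simplification, not the paper's exact metrics) measures a program's identifier vocabulary using Python's standard-library AST; the same idea applies to the abstract syntax trees of any teaching language.

```python
import ast

def vocabulary(source):
    """Return the set of distinct identifiers used in Python source code,
    found by walking the abstract syntax tree for Name nodes."""
    tree = ast.parse(source)
    return {node.id for node in ast.walk(tree) if isinstance(node, ast.Name)}

# A hypothetical student program built on top of demo code.
demo = "x = size(200)\ny = x + 1\nprint(y)"
vocab = vocabulary(demo)
```

Tracking how this set grows across a student's saved revisions gives a simple quantitative view of vocabulary development over time.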
R-VAE: Live latent space drum rhythm generation from minimal-size datasets
In this article, we present R-VAE, a system designed for the modeling and exploration of latent spaces learned from rhythms encoded in MIDI clips. The system is based on a variational autoencoder neural network, uses a data structure capable of encoding rhythms in simple and compound meter, and can learn models from little training data. To facilitate the exploration of models, we implemented a visualizer that relies on the dynamic nature of the pulsing rhythmic patterns. To test our system in real-life musical practice, we collected small-scale datasets of contemporary music genre rhythms and trained models with them. We found that the non-linearities of the learned latent spaces, coupled with tactile interfaces for interacting with the models, were very expressive and led to unexpected places in musical composition and live performance settings. A music album was recorded and premiered at a major music festival, using the VAE latent space on stage.
Generation and visualization of rhythmic latent spaces
In this paper we extend R-VAE, a system designed for the modeling and exploration of latent spaces of musical rhythms. R-VAE employs a data representation that encodes simple and compound meter rhythms, common in some contemporary popular music genres. It can be trained with small datasets, enabling rapid customization and exploration by individual users. To facilitate the exploration of the latent space, we provide R-VAE with a web-based visualizer designed for the dynamic representation of rhythmic latent spaces. To the best of our knowledge, this is the first time that a dynamic visualization has been implemented to observe a latent space learned from rhythmic patterns.
Contemporary Machine Learning for Audio and Music Generation on the Web: Current Challenges and Potential Solutions
We evaluate specific Web-based technologies that can be used to implement complex contemporary Machine Learning systems for Computer Music research, in particular for the problem of audio signal generation. As a result of greater investment from large corporations including Google and Facebook in areas such as the development of Web-based, accelerated, cross-platform Machine Learning libraries, alongside greater interest and engagement from the academic community in exploring such approaches, Machine Learning is becoming much more prevalent on the Web. This could have great potential impact for Computer Music research, acting to democratise access to complex, accelerated Machine Learning technologies through increased usability and flexibility, in tandem with clear documentation and examples. However, some problems remain in relation to the creation of more complete Machine Learning pipelines for Music and Sound generation. We discuss some key potential challenges in this area, and attempt to evaluate some relevant solutions for developing more accessible Computer Music Machine Learning systems.